WHAT IS CLAIMED IS:

1. A method for processing a matrix of elements in a processor, the
method comprising steps of:
loading a first subset of matrix elements from a first location;
loading a second subset of matrix elements from a second location;
storing a third subset of matrix elements in a first destination; and
storing a fourth subset of matrix elements in a second destination, wherein
the loading and storing steps result from a first instruction issue.

2. The method for processing the matrix of elements in the processor
as recited in claim 1, wherein n sub-instructions perform an n-by-n matrix transpose.

3. The method for processing the matrix of elements in the processor
as recited in claim 1, wherein the first loading step is performed with a first processing
path and the second loading step is performed with a second processing path.

4. The method for processing the matrix of elements in the processor
as recited in claim 1, further comprising the steps of:
loading a fifth subset of matrix elements from a fifth location;
loading a sixth subset of matrix elements from a sixth location;
storing a seventh subset of matrix elements in a third destination; and
storing an eighth subset of matrix elements in a fourth destination.

5. The method for processing the matrix of elements in the processor
as recited in claim 4, wherein the loading and storing steps introduced in claim 4 result
from a second instruction issue.

6. The method for processing the matrix of elements in the processor
as recited in claim 4, wherein each of the first through fourth destinations include a matrix
column.

7. The method for processing the matrix of elements in the processor
as recited in claim 1, wherein each of the first through fourth locations include a matrix
row.

8. The method for processing the matrix of elements in the processor
as recited in claim 1, wherein the third and fourth subsets each comprise elements from
the first and second subsets.
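
As an illustration only, the method of claims 1, 2 and 8 can be modeled in software. The following is a minimal sketch, assuming each source location holds one matrix row, each destination receives one matrix column, and a single instruction issue carries n sub-instructions that all read their sources before any result is written. The names `issue`, `transpose_in_one_issue`, and the register labels `s0`, `d0`, and so on are hypothetical and do not appear in the claims.

```python
def issue(registers, sub_instructions):
    """Model one instruction issue: every sub-instruction reads the
    pre-issue register state, then its result is written to its destination."""
    snapshot = {name: list(values) for name, values in registers.items()}
    for dest, sources, field in sub_instructions:
        # Gather element `field` from each source row to form one column.
        registers[dest] = [snapshot[src][field] for src in sources]
    return registers


def transpose_in_one_issue(rows):
    """Transpose an n-by-n matrix using n sub-instructions in a single issue."""
    n = len(rows)
    registers = {f"s{i}": list(row) for i, row in enumerate(rows)}
    sources = [f"s{i}" for i in range(n)]
    # One sub-instruction per destination column (hypothetical encoding).
    word = [(f"d{j}", sources, j) for j in range(n)]
    issue(registers, word)
    return [registers[f"d{j}"] for j in range(n)]


if __name__ == "__main__":
    print(transpose_in_one_issue([[1, 2], [3, 4]]))  # [[1, 3], [2, 4]]
```

In this model each destination column draws one element from every source row, which corresponds to the third and fourth subsets each comprising elements from the first and second subsets.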

9. A processing core for transposing a matrix, comprising:
a first source location comprising a first plurality of matrix elements;
a second source register comprising a second plurality of matrix elements;
a third source register comprising a third plurality of matrix elements;
a fourth source register comprising a fourth plurality of matrix elements;
a first destination register comprising a fifth plurality of matrix elements;
a second destination register comprising a sixth plurality of matrix
elements;
a first processing path coupled to the first through fourth source registers
and the first destination register; and
a second processing path coupled to the first through fourth source
registers and the second destination register.

10. The processing core for transposing the matrix of claim 9, wherein:
the first through fourth registers each include a plurality of source fields,
and
each source field includes a matrix element.

11. The processing core for transposing the matrix of claim 9, wherein:
the first and second destination registers each include a plurality of result
fields, and
each source field includes a matrix element.

12. The processing core for transposing the matrix of claim 9, further
comprising:
first and second instruction processors; and
an exchange path between the first and second instruction processors.



13. The processing core for transposing the matrix of claim 9, wherein
the first processing path receives a first sub-instruction and the second processing path
receives a second sub-instruction.

14. The processing core for transposing the matrix of claim 9, wherein
each of the first through fourth source registers include a matrix row.



15. The processing core for transposing the matrix of claim 9, wherein
each of the first and second destination registers include a matrix column.



16. The processing core for transposing the matrix of claim 9, wherein
the first and second destination registers are addressed by a first and second
sub-instructions which are included in a very long instruction word.

17. A method for processing a matrix of elements, the method
comprising steps of:
loading a first instruction;
loading a second instruction, wherein the first and second instructions
address a first source register, second source register, third source register, fourth source
register, first destination register and second destination register;
loading a third instruction;
loading a fourth instruction, wherein the third and fourth instructions
address the first source register, the second source register, the third source register, the
fourth source register, a third destination register and a fourth destination register;
storing a first element of the first source register in the first destination
register; and
storing a fourth element of the first source register in the fourth destination
register, wherein a plurality of the first through fourth elements comprise a same
instruction issue.

18. The method for processing the matrix of elements of claim 17,
wherein the first and second instructions include a first operation code and the third and
fourth instructions include a second operation code different from the first operation code.



19. The method for processing the matrix of elements of claim 17,
wherein the first and second instructions include a first operation code and the third and
fourth instructions include a second operation code different from the first operation code.
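
By way of illustration, the arrangement recited in claims 9 and 13 through 18 can be sketched in software. The sketch below assumes four source registers that each hold one row of a 4-by-4 matrix, two processing paths that each write one destination register per issue, and two issues whose operation codes differ. The operation code names `TRANSPOSE_LO` and `TRANSPOSE_HI`, the register labels, and the rule that the operation code selects which pair of columns is extracted are all assumptions made for this example; the claims do not specify them.

```python
SOURCES = ("s0", "s1", "s2", "s3")

# Hypothetical operation codes: each selects the pair of columns one issue extracts.
COLUMN_BASE = {"TRANSPOSE_LO": 0, "TRANSPOSE_HI": 2}


def run_path(path_index, opcode, regs):
    """One processing path: read all four source registers, return one column."""
    column = COLUMN_BASE[opcode] + path_index
    return [regs[src][column] for src in SOURCES]


def issue_vliw(opcode, dests, regs):
    """One issue of a word with two sub-instructions, one per processing path.
    Both paths compute their results before either destination is written."""
    results = [run_path(path, opcode, regs) for path in range(len(dests))]
    for dest, value in zip(dests, results):
        regs[dest] = value


def transpose_4x4(matrix):
    """Transpose a 4-by-4 matrix in two issues of two sub-instructions each."""
    regs = {src: list(row) for src, row in zip(SOURCES, matrix)}
    issue_vliw("TRANSPOSE_LO", ("d0", "d1"), regs)  # first and second instructions
    issue_vliw("TRANSPOSE_HI", ("d2", "d3"), regs)  # third and fourth instructions
    return [regs[d] for d in ("d0", "d1", "d2", "d3")]


if __name__ == "__main__":
    m = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    for column in transpose_4x4(m):
        print(column)
```

Under these assumptions, the first element of the first source register lands in the first destination register and its fourth element lands in the fourth destination register, as in the storing steps of claim 17.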